CU VOCAL: corpus-based syllable concatenation for Chinese speech synthesis across domains and dialects

Authors

  • Helen M. Meng
  • Chi-Kin Keung
  • Kai-Chung Siu
  • Tien Ying Fung
  • Pak-Chung Ching
Abstract

This paper describes CU VOCAL, a Chinese text-to-speech synthesis system that adopts the approach of corpus-based syllable concatenation. We have demonstrated the applicability of the approach primarily for Cantonese, a major dialect of Chinese predominant in Hong Kong, South China and many overseas Chinese communities. This work extends our previous work as described in [1]. Our approach is able to synthesize speech from free-form text, and it can also be optimized for response generation in specific application domains. We have also demonstrated the portability of the approach to Putonghua, the official Chinese dialect, in a domain-optimized setting. Coarticulatory context is expressed in terms of distinctive features. Tonal context is also included. We conducted a series of listening tests using CU VOCAL, which gave favorable performance.
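
As a rough illustration of the idea in the abstract, the sketch below picks, for a target syllable, the recorded corpus unit whose neighboring context matches best, with coarticulatory context expressed as sets of distinctive features plus neighboring tones. This is a minimal sketch and not the CU VOCAL implementation: the names `Context`, `Unit`, `context_cost` and `select_unit`, the feature labels, the mismatch-count cost and the `tone_weight` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Context:
    """Hypothetical context description for one syllable slot."""
    left_features: FrozenSet[str]   # distinctive features of the preceding sound
    right_features: FrozenSet[str]  # distinctive features of the following sound
    left_tone: int                  # tone of the preceding syllable (0 = none)
    right_tone: int                 # tone of the following syllable (0 = none)


@dataclass(frozen=True)
class Unit:
    """One recorded syllable in the corpus (hypothetical structure)."""
    unit_id: str
    syllable: str   # e.g. a Jyutping label such as "gong2"
    context: Context


def context_cost(target: Context, candidate: Context, tone_weight: float = 1.0) -> float:
    """Assumed cost: number of mismatched features plus weighted tone mismatches."""
    feature_mismatch = (
        len(target.left_features ^ candidate.left_features)
        + len(target.right_features ^ candidate.right_features)
    )
    tone_mismatch = int(target.left_tone != candidate.left_tone) + int(
        target.right_tone != candidate.right_tone
    )
    return feature_mismatch + tone_weight * tone_mismatch


def select_unit(syllable: str, target: Context, corpus: List[Unit]) -> Optional[Unit]:
    """Return the corpus unit for `syllable` with the lowest context cost, if any."""
    candidates = [u for u in corpus if u.syllable == syllable]
    if not candidates:
        return None
    return min(candidates, key=lambda u: context_cost(target, u.context))


if __name__ == "__main__":
    # Toy corpus: two recordings of "gong2" taken from different contexts.
    corpus = [
        Unit("u1", "gong2", Context(frozenset({"stop"}), frozenset({"vowel"}), 3, 2)),
        Unit("u2", "gong2", Context(frozenset({"nasal", "velar"}), frozenset(), 1, 0)),
    ]
    # Target slot: "gong2" after a syllable ending in a velar nasal with tone 1,
    # at the end of the utterance.
    target = Context(frozenset({"nasal", "velar"}), frozenset(), 1, 0)
    best = select_unit("gong2", target, corpus)
    print(best.unit_id if best else "no candidate")
```

A domain-optimized setting, as mentioned in the abstract, would presumably restrict or enrich the corpus with recordings from the target domain. The selection step sketched here would stay the same under that assumption.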


Similar resources

Recent enhancements in CU VOCAL for Chinese TTS-enabled applications

CU VOCAL is a Cantonese text-to-speech (TTS) engine. We use a syllable-based concatenative synthesis approach to generate intelligible and natural synthesized speech [1]. This paper describes several recent enhancements in CU VOCAL. First, we have augmented the syllable unit selection strategy with a positional feature. This feature specifies the relative location of a syllable in a sentence an...


The WISTON Text to Speech System for Blizzard Challenge 2010

The paper introduces the speech synthesis system developed by the Institute of Automation, Chinese Academy of Sciences (CASIA) for Blizzard Challenge 2010. The large-corpus-based speech synthesis system, WISTON, was built to synthesize Mandarin speech. This year, a new prosodic structure prediction model was used, which is more precise and compact than before. Furthermore, two kinds of syllable s...


Automatic Segmentation and Labeling for Mandarin Chinese Speech Corpus for Concatenation-based TTS

Cheng-Yuan Lin, Jyh-Shing Roger Jang, Kuan-Ting Chen. Multimedia Information Retrieval Laboratory, Dept. of Computer Science, National Tsing Hua University, HsingChu, Taiwan. +88635715131-3506. {gavins, jang, marco}@wayne.cs.nthu.edu.tw. Abstract: Precise phone/syllable boundary labeling of utterances in a speech corpus plays an important role in constructing corpus-b...


Sub-syllabic Acoustic Modeling across Chinese Dialects

This paper presents a series of experiments on sub-syllabic unit selection across two Chinese dialects, Mandarin and Cantonese. Evaluations are based on syllable recognition using only acoustic information, and no lexical knowledge is incorporated. We use a variety of sub-syllabic acoustic models, motivated by the phonological and linguistic structural characteristics of Chinese. Our results sho...


Development of Concatenative Syllable based Text to Speech Synthesis System for Tamil

This paper addresses the problem of improving the intelligibility of synthesized speech in a Tamil TTS synthesis system. Speech synthesis is the artificial generation of human speech, and a text-to-speech (TTS) system automatically converts normal language text into speech. This paper deals with a corpus-driven Tamil TTS system based on the concatenative synthesis approach. Conc...




Publication date: 2002